feat: Client-tool execution + permission system#3370
feat: Client-tool execution + permission system#3370BinaryMuse wants to merge 38 commits intomainfrom
Conversation
…oundary This change extracts four new abstractions and sets a clear architectural rule that will guide subsequent refactoring: **Mutation boundary (Phase 1a):** AppState no longer owns an mpsc::Sender. State is passive data; the event loop (or stream task) is the only thing that emits events. handle_client_tool_call is now a pure mutation — the CheckToolCallPermission event is sent by the caller. **AppContext (Phase 1b):** Session-scoped API configuration (endpoint, token, send_cwd) is now a single Clone struct instead of three individually-cloned bindings in the event loop. **ClientContext + ChatRequest (Phase 1c):** Machine identity (OS, shell, distro) is computed once per session via ClientContext::detect() instead of on every SSE request. ChatRequest wraps the per-turn message/session payload, replacing the inline request body construction in create_chat_stream. **ToolDescriptor (Phase 1d):** Centralizes tool metadata — canonical names, display verbs, progressive/past verbs, and client/server classification — into static descriptors. Replaces four separate name-to-text match sites and fixes a bug where is_client was checked against 'file_read' (wrong) instead of 'read_file' (correct). Also fixes several clippy warnings in modified code.
…rames **State decomposition (Phase 2a):** AppState is replaced by three types with clear ownership: - Conversation: owns the event log and session_id. All pure query and event-manipulation methods (events_to_messages, current_command, has_any_command, etc.) move here. - Interaction: owns ephemeral UI state (mode, is_input_blank, confirmation_pending, streaming_status, error). - Session: the top-level type containing conversation, interaction, pending_tool_calls, exit_action, and stream_abort. Cross-cutting lifecycle methods (start_streaming, cancel_streaming, add_tool_call, etc.) stay here. **Dispatch extraction (Phase 2b):** The 400-line match event in the spawn_blocking loop is now a 5-line dispatch call. All 12 handlers are named functions in a new tui/dispatch.rs module. inline.rs shrinks from ~640 lines to ~240 lines. **Stream launch centralization (Phase 2c):** The duplicated 8-line stream launch ritual (present in ContinueAfterTools, SubmitInput, and Retry) is replaced by a single launch_stream function that takes a setup callback for pre-work. Each handler collapses to one line. **Stream frame split (Phase 2d):** ChatStreamEvent is replaced by StreamFrame with explicit Content (TextChunk, ToolCall, ToolResult) and Control (Done, Error, StatusChanged) variants. run_chat_stream now dispatches on frame type, with apply_content_frame and apply_control_frame as separate functions.
Extracts three abstractions that untangle the ~180-line on_check_tool_permission handler: **PermissionResolver (permissions/resolver.rs):** Composes the PermissionWalker + PermissionChecker into a single new/check API. The handler no longer imports Walker, Checker, or Request directly. **ToolOutcome + ClientToolCall::execute (tools/mod.rs):** Each tool variant owns its execution logic. ReadToolCall::execute handles both directory listing and file reading (previously only directory listing worked — file reading was a no-op). Returns ToolOutcome::Success or ToolOutcome::Error, replacing the inline match arms. **PendingToolCall state transitions (tools/mod.rs):** mark_asking(), mark_executing(), mark_denied() methods formalize the state machine instead of direct enum variant assignment. **Session::complete_tool_call (tui/state.rs):** Combines add_tool_result + pending_tool_calls.retain into one method, replacing the repeated cleanup pattern in the handler. The handler drops from ~180 lines to ~60 lines.
Add the history database to AppContext and plumb it through dispatch so client tools can run async database queries. AtuinHistory::execute uses atuin-client's Database::search with fuzzy matching, the first filter mode from the tool call, and configurable limit. Also fix two pre-existing bugs that prevented client tools from working end-to-end: AtuinHistory::matches_rule had a todo!() panic that crashed the permission check task, and on_select_permission Allow was discarding the tool call instead of executing it.
Format results with local timezone timestamp and human-readable duration (e.g. 3s, 1m23s, 120ms) alongside command, cwd, and exit code.
Parse shell commands with tree-sitter-bash (POSIX family) and tree-sitter-fish to extract all subcommands from compound expressions (&&, ||, pipes, subshells, $(...), etc). Wire into ShellToolCall::matches_rule for scope matching. Scope matching supports three wildcard styles: - `ls *` (space before *): word-boundary match - `ls*` (no space): prefix/glob match - `git * amend` (middle *): matches zero+ words between segments Also fixes: variable assignments excluded from command text, fallback parser double-split bug, and shell field parsed from tool call input for correct parser selection.
Add 77 adversarial tests covering nested substitutions, variable assignments, control flow, redirections, subshells, background jobs, real-world commands, fish-specific syntax, and scope matching edge cases. Fix: remove `concatenation` from BASH_LEAVES so that subcommands inside argument concatenations (e.g. `make -j$(nproc)`) are properly extracted. Known limitations verified by tests: - `find -exec` body is opaque to tree-sitter (not parsed as commands) - `[` in test conditions is not extracted as a command - `eval`/`exec`/`source` argument bodies are not recursively parsed
Shell tool calls now execute locally with a streaming VT100 preview in the TUI. The full stdout and stderr are captured separately and sent to the LLM as structured results with exit code and duration. Key changes: - ToolOutcome::Structured variant with separated stdout/stderr/exit code/duration - execute_shell_command_streaming uses vt100 crate for ANSI/progress bar handling - Shared execute_tool dispatch eliminates duplicated shell execution paths - begin_tool_call/finish_tool_call for ToolCall persistence in chat output - Ctrl+C interrupts running commands instead of exiting the app - Viewport component in eye_declare for fixed-height tail rendering
Replace the split state model (pending_tool_calls VecDeque + preview field on ConversationEvent::ToolCall) with a unified ToolTracker that owns each tool call through its full lifecycle, including after completion. This eliminates the preview copy-back dance, the two-place preview lookup, and the TurnBuilder second-pass update_previews step. Key changes: - New ToolTracker/TrackedTool/ToolPhase types replace PendingToolCall/ToolCallState - ConversationEvent::ToolCall drops its preview field (now purely API-facing) - shell_abort_tx moves from Session to TrackedTool.abort_tx (per-tool, not per-session) - TurnBuilder takes &ToolTracker reference, looks up previews inline - Fix spinner not animating during shell preview (work around eye_declare interval reset by computing frame from system clock) - Fix word wrapping in shell output preview (use truncation instead of word-boundary wrapping for VT100 content) - Use multi-thread tokio runtime for AI commands
# Conflicts: # crates/atuin-ai/src/commands/inline.rs
|
@greptileai Please review this draft |
Greptile SummaryThis PR decomposes the AI chat TUI into
Confidence Score: 4/5Safe to merge with one security concern: unadvertised tools can be requested and queued for permission prompts by a non-compliant server. All three previously-flagged P0/P1 issues are resolved. One new P1 remains: the capability gating on the receive path. The permission system's default-Ask behavior limits the blast radius, but the advertise-vs-enforce gap is a real correctness issue worth fixing before shipping to users who believe certain tools are disabled. crates/atuin-ai/src/stream.rs — apply_content_frame capability check
|
| Filename | Overview |
|---|---|
| crates/atuin-ai/src/stream.rs | SSE framing and content dispatch — accepted tool names include read_file, create_file, execute_shell_command regardless of advertised capabilities |
| crates/atuin-ai/src/tools/mod.rs | Tool execution lifecycle, ToolTracker, shell streaming — previously-flagged todo!/subtraction issues are resolved; path_matches_scope handles Read/Write scopes correctly |
| crates/atuin-ai/src/tui/dispatch.rs | Event dispatch — previously-flagged AlwaysAllow stubs are now implemented, writing rules and executing tools correctly |
| crates/atuin-ai/src/tui/state.rs | Domain state and conversation event log — events_to_messages correctly handles mixed text+tool_call content blocks |
| crates/atuin-ai/src/permissions/check.rs | Permission checker with ask→deny→allow priority ordering — default is Ask, which is the correct safe fallback |
| crates/atuin-ai/src/permissions/writer.rs | Persists allow/deny rules to TOML files — correctly deduplicates and preserves formatting; acknowledged as not concurrent-safe but serialized by UI |
Reviews (3): Last reviewed commit: "feedback" | Re-trigger Greptile
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
| if let Ok(tool) = ClientToolCall::try_from((name.as_str(), &input)) { | ||
| // Client-side tool — add to tracker and conversation, queue permission check | ||
| let id_for_event = id.clone(); | ||
| handle.update(move |state| { | ||
| state.handle_client_tool_call(id_for_event, tool, input); | ||
| }); | ||
| let _ = tx.send(AiTuiEvent::CheckToolCallPermission(id)); | ||
| } else { | ||
| // Server-side tool — just add to conversation events | ||
| handle.update(move |state| { | ||
| state.add_tool_call(id, name, input); | ||
| }); | ||
| } |
There was a problem hiding this comment.
Capability gating is advertise-only, not enforced on receive
ClientToolCall::try_from accepts read_file, create_file, and execute_shell_command regardless of what was sent in the capabilities array. Only client_v1_atuin_history is in the default capabilities, but a server (compromised or otherwise) can still request the other tools and the client will parse and queue them for permission-checking. The PR description says these tools are "gated behind capability flags" — that's only true from the server's perspective.
The permission system defaults to Ask, so execution still requires user confirmation. But it means a user who believes shell execution is disabled could be prompted to grant it unexpectedly.
Consider filtering in apply_content_frame against the advertised capabilities, or add a ClientToolCall::is_capability_enabled check before queuing the permission event.
Summary
Adds client-side tool execution to Atuin AI, starting with
atuin_history. The server can request tool calls, which are executed locally with a permission system, and results are sent back to continue the conversation.Architecture
inline.rsintostream.rs(SSE framing),dispatch.rs(event handling),state.rs(domain state),tools/(tool types + execution), andpermissions/(rule resolution)pending_tool_callsandConversationEventpreview fields. Tracks each tool from permission check through completion, including cached previews for shell command outputgit *allows all git subcommands). Supports auto-allow, deny, and interactive ask with an inline selection UIViewportwidget, with interrupt support (Ctrl+C)Ships with
atuin_historytool active by default — lets the AI search your command historyai.openingsettings (send_cwd, send_last_command)